Incremental Learning of Variable Rate Concept Drift
Abstract
We have recently introduced an incremental learning algorithm, Learn.NSE, for Non-Stationary Environments, in which the data distribution changes over time due to concept drift. Learn.NSE is an ensemble-of-classifiers approach: it trains a new classifier on each consecutive batch of data as it becomes available and combines the classifiers through an age-adjusted, dynamic error-based weighted majority vote. Prior work has shown the algorithm's ability to track gradually changing environments, as well as its ability to retain former knowledge in cyclical or recurring-drift settings, by retaining and appropriately weighting all classifiers generated thus far. In this contribution, we extend the analysis of the algorithm to more challenging environments with varying drift rates; more importantly, we present preliminary results on the algorithm's ability to accommodate the addition or removal of classes over time. We also present comparative results for a variant of the algorithm that employs error-based pruning in cyclical environments.
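The batch-incremental ensemble scheme described above can be illustrated with a minimal sketch. Note this is a hypothetical simplification, not Learn.NSE itself: it trains one classifier per batch and uses a simple log-odds error weighting in place of the algorithm's age-adjusted sigmoid-averaged weights; the class and function names (`StreamingEnsemble`, `train_fn`) are illustrative inventions.

```python
import math

class StreamingEnsemble:
    """Sketch of a batch-incremental ensemble with error-based
    weighted majority voting (a simplified stand-in for the
    weighting scheme described in the abstract)."""

    def __init__(self, train_fn):
        self.train_fn = train_fn   # builds a classifier from one batch
        self.members = []          # list of {"clf": ..., "w": ...}

    def update(self, X, y):
        # Re-weight every existing member by its error on the new
        # batch: w = log((1 - e) / e), so members that fail on the
        # current concept receive low (or negative) weight.
        for m in self.members:
            err = sum(m["clf"](x) != t for x, t in zip(X, y)) / len(y)
            err = min(max(err, 1e-6), 1 - 1e-6)   # avoid log(0)
            m["w"] = math.log((1 - err) / err)
        # Train a new member on the current batch.
        self.members.append({"clf": self.train_fn(X, y), "w": 1.0})

    def predict(self, x):
        # Weighted majority vote over members with positive weight;
        # outdated members are retained (supporting recurring drift)
        # but contribute nothing while their weight is non-positive.
        votes = {}
        for m in self.members:
            if m["w"] > 0:
                label = m["clf"](x)
                votes[label] = votes.get(label, 0.0) + m["w"]
        return max(votes, key=votes.get)

def train_1nn(X, y):
    """Toy base learner: 1-nearest-neighbour on scalar inputs."""
    data = list(zip(X, y))
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]
```

As a usage example, feeding a second batch with a flipped concept drives the first member's weight negative, so the ensemble immediately tracks the new labels while still keeping the old member for possible reuse:

```python
ens = StreamingEnsemble(train_1nn)
ens.update([0, 1, 2, 3], [0, 0, 1, 1])   # initial concept
ens.update([0, 1, 2, 3], [1, 1, 0, 0])   # concept flips
ens.predict(0)                           # now follows the new concept
```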
Similar Papers
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
Data stream is a sequence of data generated from various information sources at a high speed and high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, commonly the stream is divided into fixed size windows or gradual forgetting is used. Concept drift refer...
Dynamic Weighted Majority for Incremental Learning of Imbalanced Data Streams with Concept Drift
Concept drifts occurring in data streams will jeopardize the accuracy and stability of the online learning process. If the data stream is imbalanced, it will be even more challenging to detect and cure the concept drift. In the literature, these two problems have been intensively addressed separately, but have yet to be well studied when they occur together. In this paper, we propose a chunk-ba...
Dynamic Programming for Bayesian Logistic Regression Learning under Concept Drift
A data stream is an ordered sequence of training instances arriving at a rate that does not permit storing them permanently in memory, which necessitates online learning methods when trying to predict some hidden target variable. In addition, concept drift often occurs, which means that the statistical properties of the target variable may change over time. In this paper, we pre...
Info-fuzzy algorithms for mining dynamic data streams
Most data mining algorithms assume static behavior of the incoming data. In the real world, the situation is different and most continuously collected data streams are generated by dynamic processes, which may change over time, in some cases even drastically. The change in the underlying concept, also known as concept drift, causes the data mining model generated from past examples to become le...
Incremental Discretization and Bayes Classifiers Handles Concept Drift and Scales Very Well
Many data sets exhibit an early plateau, where the performance of a learner peaks after seeing a few hundred (or fewer) instances. When concepts drift more slowly than the time needed to find that plateau, a simple windowing policy and an incremental discretizer let standard learners like Naïve Bayes classifiers scale to very large data sets. Our toolkit is simple to implement, can scale to millions of ...